THE HUMAN PERFORMANCE CENTER
DEPARTMENT  OF  PSYCHOLOGY 


The  Human  Performance  Center  is  a  federation  of  research 
programs  whose  emphasis  is  on  man  as  a  processor  of  information. 
Topics  under  study  include  perception,  attention,  verbal  learning  and 
behavior, short- and long-term memory, choice and decision processes,
and learning and performance in simple and complex skills.
The  integrating  concept  is  the  quantitative  description,  and  theory, 
of  man's  performance  capabilities  and  limitations  and  the  ways  in 
which  these  may  be  modified  by  learning,  by  instruction,  and  by  task 
design. 

The  Center  issues  two  series  of  reports.  A  Technical  Report 
series  includes  original  reports  of  experimental  or  theoretical 
studies, and integrative reviews of the scientific literature. A Memorandum
Report series includes printed versions of papers presented orally at
scientific or professional meetings or symposia, methodological notes and
documentary materials, apparatus notes, and exploratory studies.

Learning theory as we know it today probably was founded in the Seventeenth
Century, when Hobbes and Locke revived Aristotle's attack on the doctrine
of innate ideas. Hobbes and Locke and other empiricist philosophers took the
view that knowledge comes from experience. This view requires a learning
mechanism, and the empiricists proposed that learning is a process of combining
impressions that occur near one another in space and time, or are similar, or
contrast with one another. Empiricists argued for the plausibility of a human
organism endowed only with elementary sensory (and, presumably, motor) capacities.
Complex concepts and sequences of ideas were assumed to develop as combinations
of sensory impressions. Thus, the mechanism of association between ideas
played an important role in the argument for empiricism, and was therefore
part of the justification of the scientific method itself.

It  seems  safe  to  say  that  the  belief  in  association  as  the  elementary 
learning  event  has  dominated  theories  of  learning  and  thinking  for  at  least 
three  centuries.  The  early  view  that  associations  form  between  ideas  has  been 
replaced  in  this  century  by  the  idea  that  associations  connect  stimuli  and 
responses. But in one form or another, the hypothesis of associationism has
enjoyed nearly doctrinal status for most scientific psychologists. Most theorists
interested in learning have asked how associations are formed--not
whether  the  basic  learning  process  might  be  rather  different  from  that  described 
by  association  theory. 

Under  the  presumption  that  all  learning  probably  is  based  on  formation  of 
associations,  paired-associate  memorizing  seems  to  provide  the  paradigm  case 
of learning in its simplest and purest form. In the framework of association
theory, achievements of recall and recognition require relatively elaborate
explanations. Learning to recall is sometimes viewed as the formation of
connections between responses and some general stimuli--for example, the
properties of an experimental room. And recognition is sometimes said to
depend at least partly on a learned connection between a stimulus and some
general recognizing response which is evoked when the stimulus reappears.

The discussions of recall and recognition included in this volume do
not emphasize associationistic ideas. The operative concepts in most of the
theories presented here are encoding, storage, and retrieval of items.
Rather than asking how associations are formed between stimuli and responses,
most of the theories in this volume consider how graphic and auditory
stimuli are encoded, how records of stimuli are stored in the subject's
working or acquisition memory, and how these records are retrieved and
used to generate responses on tests of retention. The theory of memory
based on concepts of storage and retrieval evidently gives a rich and illuminating
explanation of the processes of recall and recognition, as these
are understood at present.

We are faced with an awkward theoretical situation. For tasks involving
recall or recognition of lists, concepts of storage and retrieval seem
more appropriate than concepts of associative connection. But for paired-associate
memorizing it may seem simpler to theorize using concepts of
stimulus-response associations.

In this chapter I will present evidence suggesting that the concepts
of storage and retrieval are also more appropriate than concepts of stimulus-response
connections for paired-associate memorizing. The view to which I
have  been  tentatively  persuaded  is  that  the  task  of  memorizing  associations 
is  not  paradigmatic  for  learning  processes  in  their  simplest  form.  On  the 
contrary,  I  believe  that  paired-associate  memorizing  involves  processes  that 
are  revealed  in  simpler  form  in  experiments  where  subjects  memorize  lists 
for  recall  or  recognition.  I  will  not  try  to  discuss  these  processes  in 
detail--that  is  the  task  undertaken  by  many  other  contributors  to  this  volume. 
What  I  hope  to  do  is  to  present  some  of  the  data  that  encourage  me  to  believe 
that their discussions probably describe the basic properties of paired-associate
memorizing.

A  remark  is  needed  to  avoid  a  misinterpretation.  Every  theory  of 
paired-associate  memorizing  has  to  be  an  associative  theory  in  that  it  must 
explain  how  subjects  come  to  learn  correct  responses  for  stimuli.  However, 
classical  association  theory  makes  a  specific  claim  about  the  nature  of  the 
learning  process.  The  theory  that  this  article  disputes  claims  that  stimuli 
and  responses  are  independently  manipulable  units,  and  the  learning  of  an 
association  is  the  formation  of  a  connection  between  otherwise  independent 
mental  entities.  In  situations  that  will  be  considered  here,  the  basic 
process  of  forming  connections  does  not  provide  a  complete  theory,  and  we 
will  be  concerned  with  association  theory  amended  to  include  processes  of 
response acquisition and unlearning of interfering connections.

The  alternative  theory  that  I  will  consider  takes  a  view  of  association 
that is basically Gestalt in character. Köhler (1941, p. 493) expressed the
idea when he said, "Association is ... simply coherence within the unitary
trace of a unitary experience." I propose that the first stage of memorizing
an association involves storing a representation of the stimulus-response
pair in memory as a unit. Depending on the materials used, the stimulus or
the  response  or  both  may  already  be  in  the  subject's  long-term  or  permanent 
memory.  Borrowing  concepts  used  by  Feigenbaum  and  Judith  Reitman  in  this 
volume,  the  process  of  storing  a  pair  results  in  a  structure  which  represents 
the  pair  in  the  subject's  working  or  acquisition  memory. 

In  some  situations,  successful  storage  of  an  item  may  be  all  that  is 
needed for successful retention. But in other situations, storage of an
item  in  memory  may  not  guarantee  that  the  subject  will  be  able  to  perform 
successfully  on  tests.  In  these  situations,  I  propose  that  the  second 
stage  of  memorizing  involves  learning  to  retrieve  the  stored  item  from 
memory  reliably.  The  process  of  learning  to  retrieve  could  involve 
changing the stored representation of an item, or discovering relationships
among stored items to permit better organization, or some other process.

Consider an example. Suppose that one of the items in a paired-associate
list is the pair SPIRAL-VIVID. At the beginning of the experiment,
the subject has no idea that these two words are supposed to go
together--the  item  is  not  known.  Then  at  some  time  the  subject  stores  a 
representation  of  the  pair  SPIRAL-VIVID  in  memory.  When  the  stimulus 
SPIRAL  is  presented  on  tests  there  is  some  chance  that  the  subject  will 
be  able  to  retrieve  the  stored  memory  structure  and  give  the  correct 
response.  But  there  may  also  be  failures  of  retrieval,  due  perhaps  to 
other  stored  items  with  stimuli  similar  to  SPIRAL,  or  to  requirements 
for fast responding. If the representation of SPIRAL-VIVID does not permit
rapid and reliable retrieval, then further learning is needed, and
this is what I am calling learning to retrieve. Once a retrieval strategy
for SPIRAL-VIVID is acquired, the subject will be able to respond correctly
on tests, and the item will be learned.

I am primarily concerned to claim that there are two main sub-processes
in memorization of associations, and that these involve storage and learning
to retrieve. I am less concerned in this paper with issues about the exact
nature of storage and retrieval processes. However, some discussion of
possibilities is helpful in clarifying the general ideas.

First, regarding the process of storing pairs in memory, Neisser (1967)
has argued that storage of information should be viewed as a constructive
process relating to a cognitive act. Neisser's argument seems cogent--the
mind cannot really be a blank tablet. Furthermore, the nature of the stored
memory structure for an item can vary a great deal depending on what the
subject does when he studies it. For example, in studying the pair SPIRAL-VIVID
a subject might form a visual image of a brightly colored design that
could appear on a psychedelic poster. Or he might construct an associative
mnemonic such as "spiral-viral-vivid". He might select some part of the
stimulus, such as its first letter, and code "S-vivid". Or he might simply
rehearse the pair as it was presented. The information stored by the subject
would be different in each of these cases, and questions about the
form  in  which  information  is  stored  are  very  important  and  interesting.  But 
the  notion  of  storage  as  it  is  used  in  this  paper  is  intended  to  refer  to 
any  representation  of  the  paired  associate  in  memory.  The  important  claim 
is  that  an  item  is  stored  as  a  unit,  rather  than  as  a  connection. 

Now,  suppose  that  an  item  has  been  stored.  On  a  test,  the  subject 
sees  the  stimulus  term  of  the  pair  and  he  has  to  give  the  correct  response. 
There seem to be two ways of thinking about his problem. One common way of
thinking about memory involves an analogy with a library or a filing system, or,
using Miller's (1963) idea, a junk box. An item may well be in memory and not
be found on a given occasion. If memory is like a junk box or a filing system,
then the process of learning to retrieve could be accomplished by getting the
item separated from the rest of the contents of memory in some way, or by
getting  the  contents  of  memory  organized  in  some  systematic  way  so  the  subject 
knows  where  to  look  for  things. 

There  is  another  way  of  thinking  about  memory  that  may  be  more  realistic. 
Analogies  to  filing  systems  or  junk  boxes  make  memory  seem  spatial,  with 
information  stored  and  waiting  passively  to  be  found.  Another  possibility 
is  that  memory  structures  or  engrams  are  functional  as  well  as  structural 
features  of  the  mind.  On  this  view,  a  stored  memory  structure  becomes  active 
when an appropriate signal is received--the engram may be thought of as waiting
for its number to be announced before coming forward. If memory storage
involves establishing engrams, then the question of retrieval is the question
of whether the engram becomes active when the stimulus is presented on a test.
And if it does not with sufficient reliability, then the subject has to set or
tune the engram more efficiently so that it will be activated reliably by the
presentation of the stimulus.

While  these  remarks  about  storage  and  retrieval  processes  are  entirely 
speculative,  they  demonstrate  that  reasonable  general  views  of  the  nature  of 
memory  are  consistent  with  the  claim  that  memorizing  could  easily  involve  two 
stages  that  can  be  called  storage  and  learning  to  retrieve.  Later  sections 
of this paper present evidence that supports this conceptualization.

Statistical Methods

The evidence that will be presented uses measurements of the difficulty
of learning in each of two stages in various paired-associate memorizing
experiments. These measurements are obtained by estimating the parameters of
a Markov model, using results presented in detail elsewhere (Greeno, 1968).
The model has four states:


0,  the  state  of  an  item  at  the  beginning  of  an  experiment,  applying 
until  the  item  is  stored  in  memory. 

E,  the  state  of  an  item  which  is  stored  in  memory,  but  a  reliable 
retrieval  strategy  has  not  been  acquired  and  the  subject  fails 
to  retrieve  the  item  from  memory. 

C,  the  state  of  an  item  which  is  stored  in  memory  without  a  reliable 
retrieval  strategy,  but  the  subject  succeeds  in  retrieving  the 
item  from  memory. 

L,  the  state  of  an  item  which  is  stored  in  memory  with  a  reliable 
retrieval  strategy. 

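The initial and transition probabilities of the chain are (footnote 1)

\[
P(L_1, E_1, C_1, 0_1) = \bigl(t,\; (1-s-t)r,\; (1-s-t)(1-r),\; s\bigr),
\]

\[
P =
\begin{array}{c|cccc}
    & L_{n+1} & E_{n+1} & C_{n+1} & 0_{n+1} \\ \hline
L_n & 1  & 0       & 0           & 0   \\
E_n & d  & (1-d)q  & (1-d)p      & 0   \\
C_n & 0  & q       & p           & 0   \\
0_n & ab & a(1-b)e & a(1-b)(1-e) & 1-a
\end{array}
\qquad (1)
\]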


1. In this discussion I am ignoring the problem of identifiability.
The version of the model given in Eq. 1 is not identifiable in the form given,
but in every application that will be presented there are acceptable simplifying
restrictions that make Eq. 1 identifiable. The assumption that
P(L_{n+1} | C_n) = 0 is used as an identifying restriction here. In effect, it is
assumed that learning to retrieve stored items is a process of strategy selection
that occurs only after failures to retrieve.
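
As an illustration only, the chain in Eq. 1 can be written out and simulated in a few lines. The sketch below is not part of the original analysis. It assumes that q = 1 - p (so that each row of the matrix sums to one), that errors are given in States 0 and E and correct responses in States C and L, and it uses hypothetical function names (`initial_vector`, `transition_matrix`, `simulate_item`).

```python
import random

STATES = ("L", "E", "C", "0")  # learned, stored/error, stored/correct, not yet stored


def initial_vector(s, t, r):
    """Return P(L_1, E_1, C_1, 0_1) in the order L, E, C, 0."""
    return [t, (1 - s - t) * r, (1 - s - t) * (1 - r), s]


def transition_matrix(a, b, d, e, p):
    """Rows are states on trial n, columns are states on trial n+1 (order L, E, C, 0)."""
    q = 1 - p  # assumption: q is the complement of p
    return [
        [1.0, 0.0, 0.0, 0.0],                                    # L is absorbing
        [d, (1 - d) * q, (1 - d) * p, 0.0],                      # from E
        [0.0, q, p, 0.0],                                        # from C (no direct C -> L)
        [a * b, a * (1 - b) * e, a * (1 - b) * (1 - e), 1 - a],  # from 0
    ]


def simulate_item(start, matrix, n_trials, rng):
    """Simulate one item and return its responses (1 = error, 0 = correct)."""
    state = rng.choices(range(4), weights=start)[0]
    responses = []
    for _ in range(n_trials):
        # Assumption: errors occur in E and 0, correct responses in C and L.
        responses.append(1 if STATES[state] in ("E", "0") else 0)
        state = rng.choices(range(4), weights=matrix[state])[0]
    return responses


if __name__ == "__main__":
    # Parameter values here are arbitrary, chosen only to exercise the code.
    rng = random.Random(1)
    matrix = transition_matrix(a=0.3, b=0.2, d=0.25, e=0.7, p=0.4)
    start = initial_vector(s=0.5, t=0.1, r=0.6)
    print(simulate_item(start, matrix, 12, rng))
```

Simulated sequences of this kind are useful mainly as a check that estimation code recovers known parameter values.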

It will be recognized that this model ignores important temporal features
of the memorizing process, discussed in this volume by Norman and Rumelhart
and by Judith Reitman, and elsewhere by numerous authors (e.g., Atkinson &
Shiffrin, 1968; Greeno, 1967; Peterson, 1966). Present evidence seems to
indicate that learning occurs during an interval of time including and following
the presentation of the item to be learned. In the experiments to be
discussed here, individual items were almost never repeated within short
enough intervals to produce effects due to short-term memory.

In the general form of Eq. 1, the model is a little unwieldy. Some
simplifications often are acceptable. One simplification results if the
first test comes after a single study trial on which the transition parameters
are the same as on later trials. Then

\[
t = ab, \quad r = e, \quad s = 1 - a. \qquad (2)
\]

Further simplifications are possible if the probabilities of acquiring a
retrieval strategy and retrieving stored items are the same on the first
trial after an item leaves State 0 as they are on later trials. In that case,

\[
b = d, \quad e = q. \qquad (3)
\]

If the simplifications in Eqs. 2 and 3 are acceptable, the measurements
of difficulty in the two stages of learning are straightforward. There are
just three parameters, a, d, and p. The value of a measures the difficulty
of learning in the first stage. The value of d measures the difficulty of
learning in the second stage. And the value of p is the probability of
retrieving a stored item from memory before a reliable retrieval strategy is
acquired. If the simplifications are not all acceptable, the measurements of
difficulty in the two stages of learning are less simple. However, summary
measures give reasonable indices of the difficulty in each stage. Let Z_1 be
the number of trials spent in State 0, and let Z_2 be the number of trials
spent in States E and C. The expected values of these variables are

\[
E(Z_1) = \frac{s}{a},
\qquad
E(Z_2) = (1-s-t)\left[1 + \frac{1-rd}{qd}\right] + s(1-b)\left[1 + \frac{1-ed}{qd}\right].
\]
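
The expected numbers of trials in each stage can be checked numerically against the chain itself. The following sketch is an illustration, not the original computation. It assumes q = 1 - p and the closed forms E(Z_1) = s/a and E(Z_2) = (1-s-t)[1 + (1-rd)/(qd)] + s(1-b)[1 + (1-ed)/(qd)], which follow from first-step analysis of Eq. 1, and it compares them with a direct sum of state probabilities over trials. The function names are hypothetical.

```python
def expected_trials_closed_form(s, t, r, a, b, d, e, p):
    """Closed-form E(Z_1) (trials in State 0) and E(Z_2) (trials in E or C)."""
    q = 1 - p  # assumption: q is the complement of p
    ez1 = s / a
    ez2 = ((1 - s - t) * (1 + (1 - r * d) / (q * d))
           + s * (1 - b) * (1 + (1 - e * d) / (q * d)))
    return ez1, ez2


def expected_trials_numeric(s, t, r, a, b, d, e, p, n_trials=3000):
    """Sum the probability of being in State 0, or in E or C, over many trials."""
    q = 1 - p
    matrix = [
        [1.0, 0.0, 0.0, 0.0],
        [d, (1 - d) * q, (1 - d) * p, 0.0],
        [0.0, q, p, 0.0],
        [a * b, a * (1 - b) * e, a * (1 - b) * (1 - e), 1 - a],
    ]
    dist = [t, (1 - s - t) * r, (1 - s - t) * (1 - r), s]  # order L, E, C, 0
    ez1 = ez2 = 0.0
    for _ in range(n_trials):
        ez1 += dist[3]
        ez2 += dist[1] + dist[2]
        # Advance the state distribution by one trial.
        dist = [sum(dist[i] * matrix[i][j] for i in range(4)) for j in range(4)]
    return ez1, ez2


if __name__ == "__main__":
    # Arbitrary illustrative parameter values.
    params = dict(s=0.5, t=0.1, r=0.6, a=0.3, b=0.2, d=0.25, e=0.7, p=0.4)
    print(expected_trials_closed_form(**params))
    print(expected_trials_numeric(**params))
```

Agreement between the two computations is a check on the algebra only; it says nothing about whether the model fits any particular set of data.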


To obtain the measurements of difficulty needed for the analyses we
need estimates of the parameters of the model. These can be obtained using
the method of maximum likelihood. Suppose one item shows a sequence of correct
responses (0) and errors (1):

X = 1 1 1 0 0 1 0 0 0 0 ...

Using Eq. 1, the likelihood of X is

\[
\begin{aligned}
L(X) ={}& (1-s-t)\,r\,(1-d)^3 q^3 p^2 d + s\,a(1-b)e\,(1-d)^2 q^2 p^2 d \\
        &+ s(1-a)\,a(1-b)e\,(1-d)\,q\,p^2 d + s(1-a)^2\,a(1-b)(1-e)\,q\,p\,d .
\end{aligned}
\]

Of course, this is only an illustration. The likelihood of any sequence
can be calculated using Eq. 1, in a form similar to the above equation.
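
One general way to compute the likelihood of any sequence is a forward recursion over the states of the chain. The sketch below is a standard technique substituted here for illustration, not the procedure described in the text. It assumes q = 1 - p, that errors occur only in States 0 and E, and that the item remains correct indefinitely after its last recorded error, so that it must enter State L at that point. The function name `sequence_likelihood` is hypothetical.

```python
def sequence_likelihood(x, s, t, r, a, b, d, e, p):
    """Likelihood of a response sequence x (1 = error, 0 = correct) under Eq. 1.

    Assumes the run of correct responses after the last error never ends,
    so the item must move to State L immediately after its last error.
    """
    q = 1 - p  # assumption: q is the complement of p
    matrix = [
        [1.0, 0.0, 0.0, 0.0],
        [d, (1 - d) * q, (1 - d) * p, 0.0],
        [0.0, q, p, 0.0],
        [a * b, a * (1 - b) * e, a * (1 - b) * (1 - e), 1 - a],
    ]
    start = [t, (1 - s - t) * r, (1 - s - t) * (1 - r), s]  # order L, E, C, 0
    if 1 not in x:
        return start[0]  # no errors at all: the item began in State L
    last_error = max(i for i, v in enumerate(x) if v == 1)
    alpha = None
    for v in x[: last_error + 1]:
        # States that can produce this response: E and 0 for errors, L and C otherwise.
        allowed = (0, 1, 0, 1) if v == 1 else (1, 0, 1, 0)
        if alpha is None:
            prior = start
        else:
            prior = [sum(alpha[i] * matrix[i][j] for i in range(4)) for j in range(4)]
        alpha = [prior[j] * allowed[j] for j in range(4)]
    # After the last error the item goes to L: from E with probability d, from 0 with ab.
    return alpha[1] * d + alpha[3] * a * b


if __name__ == "__main__":
    x = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]
    # Arbitrary illustrative parameter values.
    print(sequence_likelihood(x, s=0.5, t=0.1, r=0.6, a=0.3, b=0.2, d=0.25, e=0.7, p=0.4))
```

For the sequence X in the text, the recursion visits exactly the four paths that the four-term expression enumerates, one for each possible number of trials (zero to three) spent in State 0.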

The likelihood of all the data is the product of the likelihoods of the
separate sequences. The estimates of the parameters are those values that
maximize the likelihood of the data. For the model we are considering,
maximum likelihood estimates cannot be obtained algebraically, but the maximum
can be found using a computer search program. We have used Stepit
(Chandler, 1965), which uses only a few seconds of computer time to obtain a
set of estimates.
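
The idea of a numerical search can be illustrated with a deliberately crude grid search. This is not Stepit, whose algorithm is not described here. It is a minimal sketch for the three-parameter version of the model, assuming Eqs. 2 and 3 hold (b = d, e = q = 1 - p, t = ad, r = q, s = 1 - a). The response sequences are invented toy data, and the names `likelihood` and `grid_search` are hypothetical.

```python
import itertools
import math


def likelihood(x, a, d, p):
    """Likelihood of one sequence (1 = error, 0 = correct) in the 3-parameter model.

    Assumes the item stays correct forever after its last recorded error.
    """
    q = 1 - p
    start = [a * d, a * (1 - d) * q, a * (1 - d) * p, 1 - a]  # order L, E, C, 0
    matrix = [
        [1.0, 0.0, 0.0, 0.0],
        [d, (1 - d) * q, (1 - d) * p, 0.0],
        [0.0, q, p, 0.0],
        [a * d, a * (1 - d) * q, a * (1 - d) * p, 1 - a],
    ]
    if 1 not in x:
        return start[0]
    last_error = max(i for i, v in enumerate(x) if v == 1)
    alpha = None
    for v in x[: last_error + 1]:
        allowed = (0, 1, 0, 1) if v == 1 else (1, 0, 1, 0)
        if alpha is None:
            prior = start
        else:
            prior = [sum(alpha[i] * matrix[i][j] for i in range(4)) for j in range(4)]
        alpha = [prior[j] * allowed[j] for j in range(4)]
    return alpha[1] * d + alpha[3] * a * d


def grid_search(sequences, steps=19):
    """Return (log likelihood, a, d, p) maximizing the likelihood over a coarse grid."""
    grid = [(k + 1) / (steps + 1) for k in range(steps)]  # 0.05, 0.10, ..., 0.95
    best = None
    for a, d, p in itertools.product(grid, repeat=3):
        loglik = sum(math.log(likelihood(x, a, d, p)) for x in sequences)
        if best is None or loglik > best[0]:
            best = (loglik, a, d, p)
    return best


# Hypothetical toy data, invented for illustration only.
toy_sequences = [
    [1, 1, 1, 0, 0, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
]

if __name__ == "__main__":
    print(grid_search(toy_sequences))
```

A practical search would refine the grid around the best point or use a direct-search or gradient method; the grid is used here only because it is easy to verify.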

To determine whether one or more simplifications of the model are
acceptable, likelihood ratio tests are used. The procedure involves finding
maximum likelihood estimates of the parameters of the general model, and
then finding maximum likelihood estimates of the parameters with a restriction
imposed. The value of the likelihood obtained with the restriction
will be lower than the maximum likelihood obtained without the restriction,
and the ratio of the two values (restricted over general) is called λ.
If the restricted version is correct, the value of -2 log_e λ is asymptotically
distributed as chi square with degrees of freedom equal to the
number of restrictions. In the discussion that follows, when a restriction
is called acceptable for a set of data, this means that the likelihood
ratio test for that restriction gave a test statistic with probability
greater than .05.
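
The arithmetic of a likelihood ratio test is short enough to show directly. The sketch below is an illustration under stated assumptions: the two maximized log likelihoods are supplied by the user (the values in the usage lines are invented), the chi-square tail probability is computed from the standard closed forms for integer degrees of freedom, and the names `chi2_sf` and `likelihood_ratio_test` are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-h) times a finite sum of h^j / j!.
        term = 1.0
        total = 1.0
        for j in range(1, df // 2):
            term *= h / j
            total += term
        return math.exp(-h) * total
    # Odd df: erfc term plus a finite sum of half-integer powers of h.
    total = 0.0
    for j in range(1, (df - 1) // 2 + 1):
        total += h ** (j - 0.5) / math.gamma(j + 0.5)
    return math.erfc(math.sqrt(h)) + math.exp(-h) * total


def likelihood_ratio_test(loglik_restricted, loglik_general, n_restrictions):
    """Return (-2 log lambda, p-value) for a restricted versus a general model."""
    statistic = -2.0 * (loglik_restricted - loglik_general)
    return statistic, chi2_sf(statistic, n_restrictions)


if __name__ == "__main__":
    # Invented log likelihoods, for illustration only.
    statistic, p_value = likelihood_ratio_test(-102.0, -100.0, n_restrictions=1)
    print(statistic, p_value)
    print("acceptable" if p_value > 0.05 else "not acceptable")
```

Under the convention in the text, a restriction would be called acceptable when the printed p-value exceeds .05.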

The main analyses involve tests of significance comparing different
experimental conditions in the difficulty of the two stages of learning.
Likelihood ratio tests are also used in these analyses. Suppose, for
example, that we want to test whether two groups differ in the value of
a. A maximum likelihood value is obtained for all the data of both
groups, with all of the parameters free to vary. A second maximum likelihood
value is obtained with a single value of a used for both sets of
data. The restricted value of the likelihood divided by the maximum likelihood
without the restriction gives a likelihood ratio λ. In this case
-2 log_e λ is asymptotically distributed as chi square with one degree
of freedom if the two groups really have equal values of a. Tests can
be carried out using more than one parameter, and the degrees of freedom
for the chi square distribution equal the number of parameters involved
in the test. In this way, we can test whether two groups differ in the
difficulty of the first stage of learning, or in the difficulty of the
second stage of learning, or in performance during the intermediate stage of
the learning process, or in any combination of these characteristics.

Effects of Stimulus and Response Difficulty

Michael Humphreys conducted an experiment varying the difficulty of responses
and the similarity among stimuli. The materials he used are listed
in Table 1. The four lists were learned by separate groups, using the anticipation
method. Subjects were asked to spell the responses. Some summary
statistics  are  given  in  Table  2.  Note  that  both  the  stimulus  variable  and 
the  response  variable  had  reasonably  strong  effects  in  the  experiment. 

Table 1

Lists Used in Humphreys' Experiment

EE        EH        HE        HH
1-hat.    1-HIT     11-RAS    11-GPS
2-MAK     2-1PIV    12-MAK    12-HPF
3-CAW     3-NPF     13-JAV    13-BPC
4-RAS     4-GPS     21-BAQ    21-IPW
5-BAQ     5-JPV     22-HAZ    22-NPF
6-UN      6-MPA     23-FAG    23-XPO
7-DAP     7-BPC     31-DAP    31-RPK
8-JAV     8-XPO     32-GAW    32-MPA

Table 2

Summary Data for Humphreys' Experiment

          Mean Errors Before   Mean Errors After   Mean Trial
Group     First Correct        First Correct       of Last Error
EE        3.10                 1.17                 5.71
EH        5.0?                 1.57                 8.01
HE        5.28                 1.85                 8.28
HH        6.64                 2.82                11.70

The data of this experiment allow us to test the theory of storage and
retrieval learning. Recall that in the theory, the first stage of learning
is storage of the stimulus-response pair as a unit. We should expect that
this process should be affected by both stimulus and response variables.
Then, in Eq. 1, the value of a should be influenced by both of the variables
in Humphreys' experiment. On the other hand, the theory says that the
second stage involves learning to retrieve items reliably. In Humphreys'
experiment, the main difficulty in retrieval might well be elimination of
confusion among items with similar stimuli. In this case, the second stage
of learning should be influenced mainly by the stimulus variable. In Eq. 1,
the values of b and d should be higher for groups with easy stimuli
than hard stimuli, but should not be influenced by the response variable.

Now suppose that the theory of storage and retrieval learning is
wrong, and associations are really memorized by forming connections between
stimuli and responses. A primitive version of association theory would
not allow for response effects at all, but association theorists have extended
the theory to include an additional process. The most comprehensive
treatment of the extended theory is given by Underwood and Schulz (1960).
In the extended theory, paired-associate memorizing has two stages. In
the first stage, the subject acquires the response term of the paired
associate. For a nonsense syllable response, the response learning phase
probably would involve forming associations among the components of the
response. For responses that were already well integrated, the response
learning phase would be a process of increasing the availability of the response
in the experimental situation--a process sometimes called formation
of a contextual association. The formation of an associative connection or
hookup between the response and its stimulus occurs in the second stage of
learning.

According to the theory of response-strengthening and hookup learning,
the first stage of paired-associate memorizing should be affected mainly by
response variables. This means that in Humphreys' experiment, we should
expect the value of a in Eq. 1 to be different for groups with different
responses, but a should not be influenced by the stimulus variable. In
Underwood and Schulz' theory, the difficulty of forming stimulus-response
hookups depends on properties of both the stimuli and the responses. This
means that in Humphreys' experiment, the values of b and d in Eq. 1
might well depend on both the stimulus and the response variable. A summary
of the predictions suggested by the storage-retrieval theory and the response-hookup
learning theory is given in Table 3.

Table 3

Summary of Predictions for Humphreys' Experiment

Parameter   Storage-Retrieval            Response-Hookup
a           Depends on stimulus          Depends only
            and response variables       on response variables
b and d     Depend only on               Depend on stimulus
            stimulus variable            and response variables


The main question, then, is how the parameters of the model varied depending
on the experimental conditions. But this question is not meaningful
unless the model is approximately accurate as a description of the learning
that went on in the experiment. We want to use the parameter estimates as
psychological measurements, and as with any psychological measurements we
have to be concerned with the question of validity. For example, the predictions
summarized in Table 3 depend partly on assuming that the stages
of learning are approximately discrete and sequential. If the
response-hookup learning theory were true, but the stages overlapped, then
the model would be wrong but the estimate of a would probably be influenced
by both stimulus and response variables.

We cannot prove that the measurements obtained with a model are
valid, because we can never prove that a model is accurate. What we can
do is to perform tests that have the possibility of rejecting the model if
it is substantially wrong. The tests carried out in this case involved
comparisons between frequency distributions of statistics in the data
and distributions calculated using Eq. 1 with maximum likelihood estimates
of the parameters.

For the groups in this experiment, the simplification given as
Eq. 3 was not acceptable in one of the groups. Eq. 2 was acceptable.
Therefore, the goodness of fit of the model was tested using maximum likelihood
estimates of five parameters: a, b, d, e, and q.

An illustration of the tests will be given using the group with
hard stimuli and easy responses. Fig. 1 shows the distribution of the
number of errors made after the first correct response on each item. Fig.
2 shows the number of trials between the first correct response and the
criterion of three consecutive correct responses, which was taken as showing
learning. The agreement between the data and these theoretical distributions
seems excellent. These distributions involving performance after the
first correct response have considerable importance because the model says
that an item has to have completed the first stage of learning before a
correct response can occur. According to the model, learning that occurs
after the first correct response must be all-or-none in nature. The
distributions shown in Figs. 1 and 2 test this feature of the data.

Fig. 1 Theoretical and empirical distributions of the number of errors
after the first correct response for Group HE in Humphreys' experiment.
The histogram represents the data, and the connected
dots show the theoretical frequencies.

Fig. 2 Theoretical and empirical distributions of the number of trials
between the first correct response and the criterion for Group
HE in Humphreys' experiment.

There are two kinds of sequences that need to be separated for purposes
of estimation: sequences that have no errors after the first correct
response and sequences that have some errors after the first correct response.
Fig. 3 shows the empirical and theoretical distributions of the
number of errors before the first correct response separated into components.


Fig. 3 Theoretical and empirical distributions of the number of errors
before the first correct response. The upper panel shows frequencies
of sequences with no errors after the first correct
response, and the lower panel shows frequencies of sequences
with one or more errors after the first correct response.


The upper panel has sequences with no errors after the first correct response.
For example, a sequence that contributes to the fourth column in the upper
panel would be 1 1 1 0 0 0 .... The lower panel has sequences with one or more
errors after the first correct response. For example, a sequence contributing
to the fourth column in the lower panel might be 1 1 1 0 0 1 0 1 0 0 0 ....
The agreement in Fig. 3 is not as striking as in Figs. 1 and 2, partly
because these distributions are based on fewer cases. But it is still
satisfactory.

Figs. 4 and 5 show the distributions of errors and trials of the last
error for all trials. In effect, these test the assumptions in the model
about how the distributions in Figs. 1 and 2 combine with the distribution
in Fig. 3. These empirical distributions were not smooth, but the
theoretical curves seem to follow the main contours of the data fairly well.

Fig. 4  Theoretical and empirical frequencies of the total number of
errors.

Fig. 5  Theoretical and empirical frequencies of the total number of
trials before criterion.

The results shown from Group HE do not include the cases of greatest
disagreement between data and theory, but they do not include the best
cases either. In any event, the real question of the model's validity
depends on the overall agreement between all the empirical distributions
and all the predicted distributions. Because maximum likelihood estimates
of the parameters were used, we know something about the distributions of
goodness-of-fit chi square statistics. Let n be the number of cells in a
frequency distribution, and let m be the number of parameters estimated
from the data and used in calculating the theoretical distribution. Then
the asymptotic distribution of the chi square statistic is bounded by
$\chi^2(n-1)$ and $\chi^2(n-m-1)$ (Chernoff and Lehmann, 1954). For the four
experimental groups, a total of 20 chi square tests were carried out. One of
them was significant at the .05 level using the upper bounds of degrees of
freedom, and three were significant using the lower bounds. Statistically,
then, the predictions of the model seem to agree to an acceptable
approximation with the data. At least, we probably can have reasonable
confidence that the parameter values and tests of hypotheses about
parameters using the model will not be grossly misleading.
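
The bound on the chi square statistic quoted above can be written out as follows. This is a restatement of the standard result under its usual regularity conditions, not a formula taken from this report; the symbol $X^2$ for the goodness-of-fit statistic and the weights $\lambda_j$ are notation introduced here for illustration.

```latex
X^2 \;\xrightarrow{\;d\;}\; \sum_{i=1}^{n-m-1} Z_i^2 \;+\; \sum_{j=1}^{m} \lambda_j\, Z_{n-m-1+j}^2,
\qquad 0 \le \lambda_j \le 1, \quad Z_i \ \text{independent } N(0,1),

\text{so that, for every critical value } c, \qquad
P\!\left(\chi^2_{n-m-1} > c\right) \;\le\; \lim P\!\left(X^2 > c\right) \;\le\; P\!\left(\chi^2_{n-1} > c\right).
```

A statistic that is significant when referred to $\chi^2(n-1)$ is therefore significant under either bound, while one that is significant only when referred to $\chi^2(n-m-1)$ is ambiguous. This is why the counts of significant tests are given separately for the upper and lower bounds of degrees of freedom.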

Now recall that the main target of the analysis is to obtain evidence
for a choice between two theories of memorizing. One theory says that the
first stage is response learning, and should be hard or easy depending on
the responses that have to be learned. Another theory says that the first
stage is storage of the stimulus-response item, and should depend on both
the stimulus and response variables. If the response-hookup theory is
correct, we should find that a can be held constant across groups with the
same responses. But if the storage-retrieval learning theory is correct,
then values of a probably should depend on stimulus as well as response
variables. Table 4 has the results of testing the invariance of a
across pairs of conditions, using likelihood ratio tests. For example,
one null hypothesis is that a has the same value in groups EE and HE --
the two groups with easy responses. The test statistic was 5.97, which
has probability .015 under the null hypothesis, indicating rejection of
the null hypothesis. A similar result was obtained for the test of
invariance of a across groups EH and HH -- the two groups with hard
responses. The tests involving groups with the same stimuli are included
for completeness--they permit rejection of the hypothesis of invariance
even more strongly. Since the groups with the same responses cannot be
described with the same values of a, the results in Table 4 favor the
storage-retrieval theory over the response-hookup learning theory.

Table 4

Tests of Invariance of a

Conditions      -2 log λ       p
------------------------------------
EE vs HE          5.97        .015
EH vs HH          6.69        .010
EE vs EH         14.23        .0002
HE vs HH         16.23        .00006
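
The probabilities attached to the likelihood ratio statistics can be checked with a short calculation. The sketch below assumes that each test in Table 4 has one degree of freedom (a single parameter, a, constrained to be equal in two groups); the report does not state the degrees of freedom for these tests, so that is an assumption. The function `chi2_sf` is a hypothetical helper written for this illustration, using the closed-form series for integer degrees of freedom so that only the standard library is needed.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(chi-square with integer `df` > x)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df = 2k: exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(sqrt(x)) * sum_{r=1}^{(df-1)/2} x^(r-1/2) / (2r-1)!!
    term = math.sqrt(x)  # r = 1 term
    total = 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return math.erfc(math.sqrt(half)) + 2.0 * density * total


# Statistics from Table 4; df = 1 is an assumption, not stated in the report.
table4 = {"EE vs HE": 5.97, "EH vs HH": 6.69, "EE vs EH": 14.23, "HE vs HH": 16.23}
for label, stat in table4.items():
    print(f"{label}: -2 log lambda = {stat:5.2f}, p = {chi2_sf(stat, 1):.5f}")
```

Under the one-degree-of-freedom assumption the computed tail probabilities agree with the tabled values to the precision reported there.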

The other test involves the prediction suggested by the storage-retrieval
theory about b and d. If the second stage of memorizing is learning
to retrieve stored items, then b and d should depend on the stimulus
variable, but not on the responses. But if the second stage of memorizing
is formation of a stimulus-response connection, then the values of b and
d probably should depend on both stimulus and response variables. The
result to be reported uses the data from all four groups. In addition to
testing invariance of b and d, we test the hypothesis that b and d
were equal. The theory is required to fit the data of all four groups
with any value of a in each group, one value of b and d for groups
EE and EH, and a different value of b and d for groups HE and HH. The
performance parameters p and e were allowed to vary freely. The null
hypothesis is that b and d were equal, and depended only on the stimulus
variable. The alternative hypothesis is that all the parameters including
b and d differed among all four groups. The test has six degrees of
freedom.

The result of the test is in Table 5. The value obtained for -2 log λ
was 4.37, which has probability greater than .60 under the null hypothesis.
What we found in the statistical analysis is that we can reject the
hypothesis of equal values of a across groups with the same responses, but
we cannot reject the hypothesis of equal values of b and d across groups
with the same stimuli. This fits with expectations based on the
storage-retrieval learning theory, and thus favors a choice of that theory
over the theory of response strengthening and hookup learning.

Table 5

Parameter Estimates and -2 log λ Testing b = d,
Depending Only on Stimulus Difficulty

Condition       a      b=d      p      1-e
---------------------------------------------
EE             .29     .34     .46     .32
EH             .18     .34     .36     .62
HE             .21     .26     .40     .34
HH             .13     .26     .36     .90

Note -- -2 log λ = 4.37, p > .60.
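
The figures in the note to Table 5 can be checked by hand. The accounting of the six degrees of freedom shown here is one plausible reading (b and d free in each of four groups under the alternative, against one common value per stimulus level under the null); the report gives only the total. The tail probability uses the closed form of the chi square distribution for an even number of degrees of freedom.

```latex
\begin{aligned}
\text{df} &= \underbrace{4 \times 2}_{b,\ d \text{ free in each group}}
           \;-\; \underbrace{2}_{\text{one common value per stimulus level}} \;=\; 6, \\[4pt]
P\!\left(\chi^2_6 > x\right) &= e^{-x/2}\left[\,1 + \frac{x}{2} + \frac{1}{2}\left(\frac{x}{2}\right)^{2}\right], \\[4pt]
P\!\left(\chi^2_6 > 4.37\right) &= e^{-2.185}\,\bigl(1 + 2.185 + 2.387\bigr)
  \;\approx\; 0.1125 \times 5.572 \;\approx\; 0.627 .
\end{aligned}
```

The result, about .63, is consistent with the reported p > .60.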

The results of Humphreys' experiment have been presented using the
literary device of giving the hypotheses first and then the data. This was
done for reasons of clarity, rather than historical accuracy. Actually,
Humphreys and I had expected to obtain a confirmation of Underwood's theory
when we began the analysis, because we had not thought of any reasonable
alternative to it. We developed the theory of storage and retrieval
learning because the data seemed to disagree with Underwood's theory, at
least in the simplified form that we were considering. When a new hypothesis
is developed because of a complicated statistical result, it is wise to
replicate the study. This was done at Indiana in an experiment carried
out with the assistance of Herbert Marsh. We used the same design as
Humphreys did, but with different materials and procedures. The
lists learned by the subjects are given in Table 6. Note that the lists
were shorter (six instead of nine items), the stimuli were letters rather
than numbers, and the responses were words rather than nonsense syllables.
Whereas Humphreys' experiment was run using a memory drum with subjects
speaking their responses, our replication was run in a computer-based
laboratory with stimuli presented on CRT displays and responses typed
on keyboards. Table 7 shows summary data for the replication of Humphreys'
experiment. Apparently the changes in materials and procedures did not
eliminate the overall differences due to stimulus similarity and response
difficulty, although the effect of response difficulty seems to have been
smaller here than in Humphreys' data.

Table 6

Lists Used in Replication of Humphreys' Experiment

EE            EH            HE             HH
-----------------------------------------------------
P--Touch      P--Delft      FQ--Touch      FQ--Delft
V--Night      V--Blear      VF--Night      VF--Renal
F--Grain      F--Renal      VQ--Grain      VQ--Anode
C--Stand      C--Houri      QV--Stand      QV--Houri
L--Earth      L--Ingot      QF--Earth      QF--Ingot
S--Offer      S--Anode      FV--Offer      FV--Blear

Table 7

Summary Data for Replication of Humphreys' Experiment

Group    Mean Errors Before    Mean Errors After    Mean Trial of
         First Correct         First Correct        Last Error
-------------------------------------------------------------------
EE            2.66                  1.59                 4.56
EH            3.93                  1.16                 5.84
HE            5.02                  5.??                13.68
HH            6.19                  4.?3                13.97

Note -- ? marks a digit that is illegible in the source copy.

In testing simplifying assumptions of the general model, we found
that the simplifications of Eq. 3 were acceptable only for groups EE and
EH. The simplifications of Eq. 2 were acceptable for group HH, and nearly
acceptable for group HE (.025 < p < .05); Eq. 2 was not acceptable for
groups EE and EH. Rather than applying the model in its most general (and
weakest) form, we used the model with the restrictions that were acceptable
in the various groups. The model did not fit as well in this experiment as
it did for Humphreys' data. Of 20 tests of goodness of fit, six could be
rejected at the .05 level using upper bounds on degrees of freedom, and
eight could be rejected at .05 using the lower bounds. For many purposes,
this amount of discrepancy would be unsatisfactory, but it probably is all
right in this case since we were only concerned to see whether the pattern
of results in Humphreys' study would appear again.

Table 8 gives the estimated parameter values for the four experimental
groups. Since different simplifying restrictions applied in the different
groups, the parameters are not comparable in simple ways. In order to
obtain summaries that are comparable, the mean numbers of trials in each
stage were calculated using Eq. 4. These figures are also given in Table 8.
Note that the mean number of trials in the first stage seems to have been
influenced by both the stimulus and response variables, as was true in
Humphreys' data. In this study, however, the effect of the stimulus variable
seems to have been somewhat stronger than the effect of the response
variable. The number of trials required to complete the second stage seems
to have been determined mainly by the stimulus variable, as was true in
Humphreys' experiment. Thus, the main conclusions that were made on the
basis of Humphreys' data seem to have been corroborated in our replication.
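
The comparison of stimulus and response effects described above can be made concrete with a small calculation on the stage means listed in Table 8. The sketch below is illustrative only: the values are the E(Z1) and E(Z2) entries as read from the table, the simple difference of marginal means is a descriptive summary chosen here rather than a test used in the report, and the names `stage_means` and `main_effect` are hypothetical.

```python
# Mean trials per stage as read from Table 8: group -> (E(Z1), E(Z2)).
# First letter of a group name = stimulus level, second = response level
# (E = easy, H = hard).
stage_means = {
    "EE": (1.49, 3.00),
    "EH": (2.55, 3.34),
    "HE": (3.90, 9.75),
    "HH": (5.51, 8.51),
}


def main_effect(stage, position):
    """Mean over groups with 'H' at `position` minus mean over groups with 'E'.

    stage: 0 for the first stage, 1 for the second stage.
    position: 0 for the stimulus variable, 1 for the response variable.
    """
    hard = [v[stage] for g, v in stage_means.items() if g[position] == "H"]
    easy = [v[stage] for g, v in stage_means.items() if g[position] == "E"]
    return sum(hard) / len(hard) - sum(easy) / len(easy)


for stage, label in enumerate(("first stage", "second stage")):
    stim = main_effect(stage, 0)
    resp = main_effect(stage, 1)
    print(f"{label}: stimulus effect = {stim:+.2f}, response effect = {resp:+.2f}")
```

On these figures the stimulus variable adds roughly 2.7 trials to the first stage and the response variable roughly 1.3, while in the second stage the stimulus variable adds about 6 trials and the response variable has a small effect of the opposite sign.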

Table 8

Parameter Estimates and Theoretical Mean Numbers of Trials
in Each Stage in Replication of Humphreys' Experiment

Group    a      b      d      e      q      r      s      t     E(Z1)   E(Z2)
--------------------------------------------------------------------------------
EE       ?     .33*   .33    .73*   .73    .83    .06    .24    1.49    3.00
EH      .17    .30*   .30    .75*   .75    .85    .27    .14    2.55    3.34
HE      .2?    .13    .14    .35    .?8    .35*   .74*   .03*   3.90    9.75
HH      .1?    .0?    .17    .34    .69    .34*   .82*   .01*   5.51    8.51

Note -- * These parameters were determined by simplifying restrictions.
        ? marks a value or digit that is illegible in the source copy.

It should be remembered that the conclusion of the analysis may depend
on accepting the validity of the measurements based on the Markov model,
including the assumption of discrete stages. The analyses reported here were
carried out using the only two-stage model for which statistical methods
have been worked out. It is possible that use of other models might lead
to different conclusions. However, if the present analysis is accepted, the
conclusion based on these experiments with varying stimulus and response
difficulty is that the first stage of paired-associate memorizing is affected
by characteristics of both stimuli and responses, but the difficulty of the
second stage seems to depend almost entirely on the stimuli. This supports
the storage-retrieval theory, since it is consistent with the idea that
subjects store the stimulus-response pair as a unit, and then have to
develop strategies to retrieve the stored items from memory when they see
the stimulus terms.

Analysis  of  Negative  Transfer 

The data to be presented in this section were obtained in experiments
conducted by Carlton James where prior training produced negative transfer
in paired-associate memorizing. The experiments involve comparisons between
two conditions. One group learned two lists with the same responses but
different stimuli. This is called the A-B, C-B paradigm, and will be
referred to here as the C-B condition. The other group learned two lists with
the same stimuli and responses, but each stimulus was paired with a different
response in the second list than it was in the first list. This is called
the A-B, A-Br paradigm, and will be referred to here as the A-Br condition.

In these studies, the storage-retrieval theory cannot be compared with
the response-hookup theory. The reason is that in both the A-Br and the C-B
conditions, the responses used in the second list are the same as those used
in the first list, so there should be no effect due to response
strengthening. However, the theory of association learning includes a
different factor which should differ between the two conditions of these
experiments. In a theory that dates from Melton and Irwin's (1940) study of
retroactive interference, negative transfer is explained by the effect of
associations that are learned in the first list and must be unlearned before
the new associations can dominate performance. In an A-Br condition, where
the stimuli are the same as those learned in the first list, the effect of
first-list associations should be quite strong and retard learning by a
large amount. In a C-B condition, new stimuli are used in the second list
and the first-list associations should have a much smaller effect.

We can construct a version of the unlearning theory that would fit
with the two-stage Markov model. Keep in mind that in these experiments the
subject knows the responses from the beginning of training on the second
list, since they are the same as those used earlier. This means that any
reasonable two-stage theory should assert that both stages of learning
involve learning the associations in the second list. Suppose that in State
0, the association for a stimulus from List 1 is retained and dominates
the subject's performance on that item. The item goes from State 0 to either
State E or State C when the first-list association is unlearned. The
transition to State L occurs when the second-list association is learned.
According to this conceptualization, the main difference between A-Br and
C-B conditions should be a difference in the difficulty in accomplishing the
first, unlearning stage of the memorizing process.


The storage-retrieval theory suggests a different expectation. The
task given to an A-Br group is to learn to use each stimulus from the first
list to retrieve a response that is different from the one paired with it
originally. In the C-B group, new stimulus cues are used. This leads to
the expectation that the main difficulty in A-Br, relative to C-B, should
be in learning to retrieve the new pairs from memory, and the theory says
that this occurs in the second stage of paired-associate learning.

Data were obtained from a variety of conditions. In one experiment
each list contained ten pairs of two-syllable adjectives, with two groups
(an A-Br and a C-B group) learning the first list to a criterion of one
perfect trial (No OT) and the other two groups learning the first list to
the one-trial criterion and then receiving 15 additional trials of
overtraining (OT). In another experiment, each list contained six pairs of
two-syllable adjectives. There were eight groups in a 2 x 2 x 2 factorial
design. One factor was the main variable--the difference between A-Br
and C-B conditions. A second factor was the presence or absence of a
series of pretraining lists (PT or No PT), each with the same responses as
those used in the last two lists but with different stimuli, and each
studied for six trials. And the third factor was the presence or absence
of 18 trials of overtraining on the next-to-last list following a criterion
of one perfect trial (OT or No OT).

These experiments were carried out using a memory drum with the
anticipation procedure. Stimuli were presented for 2 sec, during which
the subject tried to give the correct response. Then the response was
shown along with the stimulus for 2 sec. There was a 4 sec pause between
each cycle in which all the items were presented.

In all, there were 12 experimental groups for this analysis. The
simplifying assumption involving the initial vector of the model (Eq. 2)
was acceptable in all the groups. Although other simplifications were
acceptable in some groups, they were not used in testing goodness of fit
or estimating the parameters of the model. The same five tests for goodness
of fit were used here as in the analyses described earlier. With 12 groups,
there were 60 tests. Three tests were significant using upper bounds of
the degrees of freedom, and 15 tests were significant using lower bounds.
Thus, the model seems to have fit these data reasonably well.
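
One rough way to put the counts of significant goodness-of-fit tests in perspective is to ask how many rejections would be expected by chance alone. The sketch below treats the tests as independent, each with a .05 chance of rejecting a correct model. That independence is an assumption made only for illustration (the five tests within a group use the same data and are certainly correlated), and the helper `prob_at_least` is hypothetical, not part of the original analysis.

```python
from math import comb


def prob_at_least(k, n, alpha=0.05):
    """P(at least k of n independent tests reject at level alpha) when the model is correct."""
    return sum(comb(n, j) * alpha**j * (1 - alpha) ** (n - j) for j in range(k, n + 1))


# (tests run, rejections using upper-bound df, rejections using lower-bound df),
# as reported in the text for each set of data.
studies = {
    "Humphreys' data": (20, 1, 3),
    "replication": (20, 6, 8),
    "negative transfer": (60, 3, 15),
}
for name, (n, upper, lower) in studies.items():
    print(
        f"{name}: expected by chance = {n * 0.05:.1f}, "
        f"P(>= {upper}) = {prob_at_least(upper, n):.4f}, "
        f"P(>= {lower}) = {prob_at_least(lower, n):.4f}"
    )
```

Because the true asymptotic distribution lies between the two chi square bounds, the upper-bound counts are conservative and the lower-bound counts liberal, so the chance benchmark is best compared with both.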

The theoretical measures of difficulty for the first and second stages
of learning are given in Table 9. The values of E(Z1) for comparable C-B
and A-Br groups seem to show small and inconsistent differences, except
for the condition with ten items and overtraining. However, the measures
of difficulty in the second stage show large and consistent differences,
with A-Br having greater difficulty in the second stage in every case.

Table 9

Theoretical Quantities for A-Br and C-B Conditions

                              Mean Trials in        Mean Trials in
                              First Stage           Second Stage
Condition                     C-B       A-Br        C-B       A-Br
--------------------------------------------------------------------
Ten Items, No OT              3.94      4.54        2.76      4.12
Ten Items, OT                 3.59      6.4?        2.95     10.05
Six Items, No PT, No OT       2.99      2.58        1.2?      5.08
Six Items, No PT, OT          2.58      3.19        2.70      4.35
Six Items, PT, No OT          1.87      1.74        ?.55      3.80
Six Items, PT, OT             2.48      2.92        1.22      3.59

Note -- ? marks a digit that is illegible in the source copy.

Statistical tests were carried out to compare the difficulty of
learning in the A-Br and C-B conditions, using separate likelihood ratio
tests for the two stages. The results are in Table 10. Note that in every
case, the difference in the second stage was significant, but the difference
in the first stage was significant only in one of the six comparisons.

These results seem to justify the conclusion that the main difference
between learning A-Br and C-B lists occurs in the second stage of memorizing.

The results of this analysis provide additional support for the
hypothesis that paired-associate memorizing involves storage and learning
to retrieve. The hypothesis of unlearning and replacement of associative
connections leads us to expect most of the difference between A-Br and
C-B to occur in the first stage. However, in five of six conditions we
failed to find a significant difference in the first stage. In the
hypothesis of storage and retrieval learning, it is reasonable to expect
the main difficulty in A-Br to involve retrieval learning, and this
expectation is consistent with the finding that most of the difference
between A-Br and C-B was in the second stage of learning.

Table 10

Tests of Invariance between A-Br and C-B

Condition                   First Stage    Second Stage    Both Stages
------------------------------------------------------------------------
Ten Items, No OT               1.4           20.4***         21.4***
Ten Items, OT                 29.8***        59.5***         88.1***
Six Items, No PT, No OT        1.1           19.7***         20.7***
Six Items, No PT, OT           1.6            9.4***          9.6***
Six Items, PT, No OT           0.1           16.3***         29.9***
Six Items, PT, OT              1.9           11.0***         32.2***

Note -- *** denotes p < .01.

Summary and Conclusion

I began this article by stating a theoretical question--whether
associations are memorized by a process of forming connections between
stimuli and responses or by a process of storing stimulus-response units and
learning to retrieve them. The preceding two sections have presented
evidence that the storage-retrieval theory is a more reasonable hypothesis
about the memorizing process. The evidence consists of results obtained by
measuring difficulty of two learning stages in various experimental
conditions, using a Markov model with the assumption that learning occurs in
two discrete stages.

First, it seems that the similarity among stimuli is quite a strong
variable in determining the difficulty of the first stage of learning. The
difficulty of the first stage is also affected by response variables.
Differences were obtained by varying the pronounceability of trigram
responses and by varying the frequency of use of word responses. The first
stage of memorizing was not affected in one important case -- five of six
comparisons between A-Br and C-B negative transfer conditions failed to show
a reliable difference in the first stage of learning.

The second stage of learning was strongly influenced in these
experiments by the similarity among stimuli, and large differences in
difficulty of the second stage were obtained in comparisons between A-Br and
C-B negative transfer conditions. These experiments have consistently failed
to show effects on the second stage of memorizing due to response variables.
Pronounceability of trigrams and frequency of words both failed to produce
reliable second-stage differences in these data.

If the measurements presented here are accepted, the findings seem
very hard to explain using the theory of stimulus-response connections.
Regarding the first stage of learning, sizable effects were found where the
theory of connections predicts little or no effect, and effects were not
found where the theory leads us to expect them. Specifically, the version of
association theory that says the first stage is mainly a process of
increasing response availability leads us to expect little or no effect of
stimulus variables in the first stage. Yet, the stimulus variables
manipulated in these studies influenced the first stage of learning
significantly. On the other hand, the version of association theory that
says old associations have to be unlearned before new associations can
dominate performance leads us to expect a substantial first-stage difference
between A-Br and C-B negative transfer conditions. But all except one of our
experimental conditions failed to show this effect.

Regarding the second stage of learning, connection theorists often
suggest (and sometimes state outright) that the formation of connections
probably comes after other processes like response strengthening or
unlearning have taken place. And the formation of connections is often
treated as a relatively symmetrical process, which would be expected to be
influenced about as much by response variables as by stimulus variables.
However, in the data reported here the second stage of learning was affected
almost exclusively by stimulus variables. Stimulus similarity had strong
effects on the second stage of learning and the difference between A-Br and
C-B conditions was mainly a second-stage effect. Response pronounceability
and frequency of word use failed to show significant effects.

On the other hand, the theory of storage and retrieval learning has
features that seem to be quite consistent with the pattern of results
obtained in these studies. First, the fact that both stimulus and response
variables affect the first stage of learning seems to support the idea that
the first stage is just the storage of the stimulus-response pair as a unit.
The fact that A-Br and C-B conditions usually did not differ in the first
stage does not seem so surprising if the first stage is storage in memory --
after all, both groups of subjects had the same material to store. And
the failure of response variables to have important effects on the second
stage of learning seems consistent with the idea that the second stage
is a process of learning to retrieve. The subject must learn to retrieve
each item using the stimulus as a cue. Therefore, similarity among the
stimuli and previous use of the stimuli to retrieve different pairs probably
should make the process of learning to retrieve more difficult.

The main conclusion of this paper is that basic concepts in a theory
of paired-associate memorizing should be storage and retrieval, rather than
the concepts of traditional association theory. The present data are
certainly insufficient to support a firm conclusion on a fundamental
theoretical question. However, to the extent that a conclusion is supported,
the conclusion seems to be that the theory of memory has no need for a
concept describing a process of association in the sense of connection
between mental elements. The processes of information storage and retrieval
which seem most adequate for handling recall and recognition memory also
seem to be favored for the theory of memory for associations.

Relationship with Other Theories

I have gone to considerable effort to emphasize differences between
the storage-retrieval theory and the traditional theory of associative
connections. I want to conclude by pointing to some consistencies between
the theory used here and others that have been developed recently.

Perhaps the clearest relationship exists between the present two-stage
theory and the all-or-none model of memorizing (Bower, 1961; Estes, 1960;
Rock, 1957). While the all-or-none hypothesis postulates a single
discrete step in learning, the present analysis assumes two such steps.
And the statistical machinery used in the present analyses is a direct
extension of that used in the all-or-none analyses (especially by Bower,
1961).

The two-stage model of Eq. 1 can be viewed as a generalization of the
all-or-none model. Suppose in Eq. 1 that b = 1. In the interpretation of
this article, this would mean that once an item is stored in memory, it can
be retrieved reliably enough to meet the experimental criterion of learning.
On this interpretation, learning should be approximately all-or-none in
cases where retrieval is easy. And this seems to fit with the facts.
Typically, experiments showing all-or-none results use short lists of items
and two or three response alternatives that were known by the subjects
at the beginning of the experiment. The experimental task then is very close
to a sorting task, where there are two or three categories and the subject
must learn which category each stimulus belongs in. As the number of
categories or the number of items in each category increases, retrieval
should become more difficult, and we should expect data to depart from the
all-or-none model. And data often seem to be consistent with this
expectation.
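
The sense in which b = 1 collapses two stages into one can be illustrated with a small calculation. The chain below is a deliberately stripped-down stand-in and is not Eq. 1: the state names U, S, and L, the roles assigned to a, b, and d, and the parameter values are all assumptions made for illustration, and the performance parameters and initial vector of the full model are left out.

```python
def learned_by_trial(n_trials, a, b, d):
    """P(item is in the learned state L) after each of the first n_trials trials.

    Hypothetical three-state chain (an assumption, not the report's Eq. 1):
      U (unstored): with prob a the item is stored; it then goes straight to L
                    with prob b, or to S (stored, not yet reliably retrieved)
                    with prob 1 - b. With prob 1 - a it stays in U.
      S: moves to L with prob d on each trial, otherwise stays in S.
      L: absorbing.
    """
    p_u, p_s, p_l = 1.0, 0.0, 0.0
    curve = []
    for _ in range(n_trials):
        # The right-hand side is evaluated with the previous trial's probabilities.
        p_u, p_s, p_l = (
            p_u * (1 - a),
            p_u * a * (1 - b) + p_s * (1 - d),
            p_l + p_u * a * b + p_s * d,
        )
        curve.append(p_l)
    return curve


two_stage = learned_by_trial(10, a=0.3, b=0.3, d=0.3)  # illustrative values only
one_step = learned_by_trial(10, a=0.3, b=1.0, d=0.3)   # b = 1: second stage is skipped
for trial, (p2, p1) in enumerate(zip(two_stage, one_step), start=1):
    print(f"trial {trial:2d}: two-stage {p2:.3f}   b = 1 {p1:.3f}")
```

With b = 1 the intermediate state is never occupied, so in this sketch the probability of having learned by trial n is 1 - (1 - a)^n, the single geometric step of an all-or-none process. With b below 1 the curve lags behind, which is the kind of departure the text expects as retrieval becomes harder.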




A  two-stage  Markov  model  similar  to  Eq.  1  was  analyzed  by  Bower  and 
Theios  (1964),  and  they  demonstrated  that  the  idea  of  two  discrete  learning 
steps  was  consistent  with  data  from  several  experiments  These  studies 
included  experiments  by  Theios  where  subjects  memorized  associations  and 
had  to  adjust  to  changes  in  the  correct  responses  for  individual  items 
Kintsch  (1963)  applied  the  two-stage  model  successfully  to  the  results  of 
a  paired-associate  experiment,  but  he  interpreted  the  stages  as  response 
learning  and  association-forming,  an  interpretation  that  seems  to  be 
questionable  in  the  light  of  results  reported  here.  Another  application 
by  Kintsch  and  Morris  (1965)  involved  recognition  and  free  recall  learning, 
but  was  consistent  conceptually  with  the  present  argument  Kintsch  and 
Morris'  data  supported  the  idea  that  when  subjects  memorize  a  list  of  words, 
the  first  stage  of  learning  an  item  permits  the  subjects  to  recognize  the 
item  and  the  second  stage  permits  him  to  recall  the  item.  Storage  and 
retrieval  seem  like  acceptable  alternative  names  for  these  two  subprocesses 
Restle  (1964)  also  proposed  a  two-stage  Markov  model  as  an  extension 
of  the  all-or-none  theory  Restle  proposed  a  trace  theory  in  which  learning 
consisted  of  acquiring  strategies  enabling  the  subject  to  recall  traces  In 
the  first  stage  of  learning,  a  subject  becomes  able  to  recall  the  response 
for  an  item,  and  m  the  second  stage  he  discriminates  that  item  from  other 
items  similar  to  it  in  the  list  Restle's  theory  is  like  the  present  theory 
in  that  mnemonic  records  are  assumed  to  represent  experiences,  rather  than 
connections.  And  Restle's  hypothesis  about  the  second  stage  of  learning  as
discrimination  seems  indistinguishable  from  the  present  view  of  learning
to  retrieve.  Restle  was  not  entirely  clear  about  the  nature  of  the  first
stage  of  learning--he  called  it  "learning  to  recall  a  response,"  but  other
aspects  of  his  theory  make  it  seem  as  though  stimulus  variables  probably  would 
influence  the  process. 
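
The  two-stage  idea  discussed  above  can  be  made  concrete  with  a  small
simulation.  What  follows  is  only  a  minimal  sketch  of  a  generic  two-stage
Markov  chain,  not  the  model  fitted  in  any  of  the  studies  cited.  The  state
labels,  the  parameter  names  (a,  b,  guess,  p_mid),  and  the  numerical  values
are  assumptions  chosen  for  illustration.  An  item  starts  unlearned,  moves  to
an  intermediate  state  with  probability  a  on  each  trial,  and  from  there  to
the  learned  state  with  probability  b.

```python
import random


def simulate_item(a, b, p_mid, guess, rng, max_trials=1000):
    """Simulate one item; return a list of error indicators (1 = error).

    Hypothetical parameters (not taken from the cited studies):
      a      probability of leaving the unlearned state on a trial
      b      probability of leaving the intermediate state on a trial
      p_mid  probability of a correct response in the intermediate state
      guess  probability of a correct response in the unlearned state
    """
    state = 0  # 0 = unlearned, 1 = intermediate, 2 = learned
    errors = []
    for _ in range(max_trials):
        if state == 2:
            break  # no further errors once the item is learned
        p_correct = guess if state == 0 else p_mid
        errors.append(0 if rng.random() < p_correct else 1)
        # At most one transition can occur after each trial.
        if state == 0:
            if rng.random() < a:
                state = 1
        elif rng.random() < b:
            state = 2
    return errors


def expected_total_errors(a, b, p_mid, guess):
    """Mean errors per item: mean trials in each state times its error rate."""
    return (1 - guess) / a + (1 - p_mid) / b


def mean_total_errors(a, b, p_mid, guess, n_items=20000, seed=0):
    """Average total errors per item over many simulated items."""
    rng = random.Random(seed)
    total = sum(sum(simulate_item(a, b, p_mid, guess, rng)) for _ in range(n_items))
    return total / n_items


if __name__ == "__main__":
    params = dict(a=0.3, b=0.2, p_mid=0.5, guess=0.25)
    print("simulated:", mean_total_errors(**params))
    print("expected: ", expected_total_errors(**params))
```

In  this  sketch  the  simulated  mean  number  of  errors  per  item  should  lie
close  to  the  value  given  by  expected_total_errors,  since  the  number  of
trials  spent  in  each  state  is  geometrically  distributed.  Whether  the  two
stages  are  labeled  storage  and  retrieval,  or  response  recall  and
discrimination,  makes  no  difference  to  the  chain  itself.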

The  hypothesis  presented  here  bears  an  interesting  relationship  to  a 
recent  theory  by  Martin  (1968).  In  Martin's  theory,  a  major  factor  in
memorizing  an  association  is  variability  in  encoding  the  stimulus.  An  hypothesis
consistent  with  Martin's  view  is  that  some  trials  may  be  required  to
establish  a  reliable  association  between  some  encoding  of  the  stimulus  and  the
response,  and  then  some  further  trials  may  be  required  to  stabilize  the 
encoding.  This  interpretation  of  Martin's  hypothesis  is  very  similar  to  the 
hypothesis  proposed  in  the  present  article.  As  nearly  as  I  can  tell,  the 
evidence  that  is  presented  here  does  not  differentiate  between  Martin's 
idea  and  mine,  and  the  two  ideas  may  be  different  expressions  of  the  same 
hypothesis.

The  present  hypothesis  of  storage  and  learning  to  retrieve  also  closely 
resembles  Feigenbaum's  (1963)  model  of  memorizing  incorporated  in  the  program
EPAM,  and  Hintzman's  (1968)  extension  of  this  work  in  the  program  SAL.  In
EPAM  and  SAL  the  early  phase  of  learning  is  called  image  building,  and  its
effect  is  to  store  a  partial  representation  of  the  stimulus  and  a
representation  of  the  response  in  memory.  The  later  phase  of  learning  permits  the
subject  to  discriminate  among  the  stimuli  in  the  list,  and  therefore
permits  reliable  retrieval.  Thus,  I  see  no  important  difference  between  the
hypothesis  offered  here  and  Feigenbaum's  and  Hintzman's  hypotheses  for  new
learning.  On  the  other  hand,  EPAM  and  SAL  might  lead  to  predictions  about
A-Br  transfer  that  differ  from  the  hypothesis  about  storage  and  retrieval
that  was  developed  based  on  James'  experimental  results.